perm filename XPAPER.OLD[TEX,DEK] blob
sn#841390 filedate 1987-06-13 generic text, type T, neo UTF8
% exercises for TeX: The Program
\hsize=6.5in \vsize=8.75in % TUGboat (cleared by BB, Jan 87)
\font\tenbxsl=cmbxsl10 % font for slanted copy in title
\font\eightrm=cmr8
% the following macros were used to make all the class handouts
\def\\#1{\hbox{\it#1\/\kern.05em}} % italic type for identifiers
\font\tentex=cmtex10 % TeX extended character set (used in strings)
\hyphenchar\tentt=-1 \hyphenchar\tentex=-1
\def\.#1{\hbox{\tentex % typewriter type for strings
\let\'=\RQ % right quote in a string
\let\`=\LQ % left quote in a string
#1}}
\def\LQ{{\tt\char'22}} % left quote in a string
\def\RQ{{\tt\char'23}} % right quote in a string
\def\O#1{\hbox{\rm\char'23\kern-.2em\it#1\/\kern.05em}} % octal constant
\def\begintt{$$\ttverbatim \catcode`\|=0 \parskip=0pt \ttfinish}
\chardef\other=12
\def\ttverbatim{\begingroup
\catcode`\\=\other \catcode`\{=\other \catcode`\}=\other
\catcode`\$=\other \catcode`\&=\other \catcode`\#=\other
\catcode`\%=\other \catcode`\~=\other \catcode`\_=\other
\catcode`\↑=\other
\obeyspaces \obeylines \tt}
{\obeyspaces\gdef {\ }}
{\catcode`\|=0 |catcode`|\=\other % | is temporary escape character
|obeylines % end of line is active
|gdef|ttfinish#1↑↑M#2\endtt{#1|vbox{#2}|endgroup$$}}
\def\|{{\tt\char`\|}}
\catcode`\|=\active \def|{\ttverbatim\let|=\endgroup}
\def\prob#1. {\medbreak\noindent\hbox to20pt{\bf #1.\hfil}}
\leftline{\bf Exercises for \tenbxsl \TeX: The Program}
\medskip
\leftline{\indent Donald E. Knuth}
\bigskip
\noindent
During the spring of 1987 I taught a course for which Volume~B of
{\sl Computers \& Typesetting\/} was the textbook. Since that book was
meant to serve primarily as a reference, not as a text, I needed
to supplement it with homework exercises and exam problems.
The problems turned out to rather interesting, and I think they might
be useful for self-study if anybody wants to learn {\sl \TeX: The
Program\/} without taking a college course.
Therefore I've collected them here and given what I think are
the correct answers.
The final problem, which deals with the typesetting
of languages that have large character sets, is especially noteworthy since
it presents an extension of \TeX\ that might prove to be useful in Asia.
Some of the problems suggested changes in the text. I've changed my original
wording of the problem statements so that they will make sense when the
next printing of the book comes out; people who have the first edition
should check the published errata before looking too closely at the questions
below.
\medbreak
\noindent{\bf The Problems}
\nobreak\smallskip\noindent
Here, then, are the exercises in the order I gave them. Although they begin
with a rather ``gentle introduction,''
I recommend that the first ones not be skipped, even if they may appear
too easy; there often is a slightly subtle point involved. Conversely,
some of the problems are real stumpers, but they are intended to teach
important lessons. A serious attempt should be made to solve each one
before turning to the answer, if the maximum benefit is to be achieved.
\prob 1. (An exercise about reading a |WEB|.) In the Pascal program defined
by the book, what immediately precedes `|PROCEDURE INITIALIZE|'? (Of
course it's a semicolon, but you should also figure out a few things that
occur immediately before that semicolon.)
\prob 2. Find an unnecessary macro in \S15.
\prob 3. Suppose that you want to make \TeX\ work in an environment where the
input file can contain two-character sequences of the form
`\\{esc}~$x$', where \\{esc}
is ASCII character number \O{33} and where $x$ is an ASCII character between
\.{\'@\'} and \.{\'\char`\_\'} inclusive. The result should be essentially
equivalent to what would have happened if the single (possibly nonprinting)
ASCII character $\\{chr}(\\{ord}(x)-\O{100})$ had been input instead.
If \\{esc} appears without being followed by an~$x$ in the desired range,
you should treat it as if the \\{esc} were ASCII character number \O{177}.
For example, the line `|A| \\{esc} |A| \\{esc} |a| \\{eoln}' should put
four codes into the buffer: \O{101}, \O{001}, \O{177}, \O{141}.
What system-dependent changes will handle this interface requirement?
\prob 4. Suppose that the string at the beginning of the \\{print\_roman\_int}
procedure were |"m2d5c2l2q5v5i"| instead of |"m2d5c2l5x2v5i"|. What
would printed from the input 69? From the input 9999?
\prob 5. Why does \\{error\_count} have a lower bound of~$-1$?
\prob 6. What is printed on the user's terminal after `|q|' is typed in
response to an error prompt? Why?
\prob 7. Give examples of how \TeX\ might fail in the following circumstances:
\smallskip\itemitem{a)}If the test `$t\le7230584$' were eliminated from \S108.
\smallskip\itemitem{b)}If the test `$s\ge1663497$' were eliminated from \S108.
\smallskip\itemitem{c)}If the test `$r>p+1$' were changed to `$r>p$' in \S127.
\smallskip\itemitem{d)}If the test `$\\{rlink}(p)\ne p$' were
eliminated from \S127.
\smallskip\itemitem{e)}If the test `$\\{lo\_mem\_max}+2\le\\{mem\_bot}+
\\{max\_halfword}$' were eliminated from \S125.
\prob 8. The purpose of this problem is to figure out what data in \\{mem}
could have generated the following output of \\{show\_node\_list}:
$$\vbox{\halign{#\hfil\qquad&\sevenrm\hfil#\cr
|\hbox(10.0+0.0)x100.0, glue set 10.0fill|&100\cr
|.\discretionary replacing 1|&200\cr
|..\kern 10.0|&300\cr
|.|\||\large U|&10000\cr
|.|\||\large ↑↑K (ligature ff)|&400\rlap{,\thinspace10001,\thinspace10002}\cr
|.\large !|&10003\cr
|.\penalty 5000|&500\cr
|.\glue 0.0 plus 1.0fill|&600\cr
|.\vbox(5.0x0.5)x10.0, shifted -5.0|&700\cr
|..\hbox(5.0x0.0)x10.0|&800\cr
|...\small d|&10004\cr
|...\small a|&10005\cr
|..\rule(0.5+0.0)x*|&900\cr}}$$
Assume that |\large| is font number~1 and that |\small| is font number~2. Also
assume that the nodes used in the lower (variable-size) part of \\{mem}
start in locations 100, 200, etc., as shown; the nodes used in the upper
(one-word) part of \\{mem} should appear in locations 10000, 10001, etc.
Make a diagram that illustrates the exact numeric contents of every
relevant \\{mem} word.
\prob 9. What will \\{short\_display} print, when given the horizontal
list inside the larger |\hbox| in the previous problem, assuming that
\\{font\_in\_short\_display} is initially zero?
\prob 10. Suppose the following commands are executed immediately
after \TeX\ has initialized itself:
$$\hbox{\\{incr}(\\{prev\_depth}); \
\\{decr}(\\{mode\_line}); \
\\{incr}(\\{prev\_graf}); \
\\{show\_activities}.}$$
What will be shown?
\prob 11. What will `\\{show\_eqtb}($\\{int\_base}+17$)' show,
after \TeX\ has initialized itself?
\prob 12. Suppose \TeX\ has been given the following definitions:
\begintt
\def\a{\advance\day by 1\relax} \def\g{\global\a}
\endtt
The effect of this inside \TeX\ will be that an appearance of\/ |\a| calls
$$\\{eq\_word\_define}(p,\\{eqtb}[p].\\{int}+1),$$ and an appearance of\/
|\g| calls
$\\{geq\_word\_define}(p,\\{eqtb}[p].\\{int}+1)$, where
$p=\\{int\_base}+\\{day\_code}$. Consider now the following commands:
\begintt
\day=0 \g\a{\a\g\a{\g\a\g}\a{\a}\a}
\endtt
Each `|{|' calls \\{new\_save\_level}(\\{simple\_group}), and each
`|}|' calls \\{unsave}.
Explain what gets pushed onto and popped off of the \\{save\_stack}, and what
gets stored in $\\{eqtb}[p]$ and $\\{xeq\_level}[p]$, as the above commands
are executed. What is the final value of\/ |\day|? \ (See {\sl The \TeX book},
exercise 15.9 and page 301.)
\prob 13. Use the notation at the bottom of page 122 in
{\sl \TeX: The Program\/} to describe
the contents of the token list corresponding to |\!| after the
definition
\begintt
\def\!!1#2![{!#]#!!2}
\endtt
has been given, assuming that
|[|, |]|, and |!| have the respective catcodes 1, 2, and~6, just as
|{|, |}|, and |#| do. \ (See exercise 20.7 in {\sl The \TeX book}.)
\prob 14. What is the absolute maximum number of characters that
will printed by \\{show\_eqtb}(\\{every\_par\_loc}), if the current value of
|\everypar| does not contain any control sequences? (Hint: The answer
exceeds~40. You may wish to verify this by running \TeX, defining an
appropriate worst-case example, and saying
\begintt
\tracingrestores=1 \tracingonline=1 {\everypar{}}
\endtt
since this will invoke \\{show\_eqtb} when |\everypar| is restored.)
\prob 15. What does {\tt INITEX} do with the following input line? (Look closely.)
\begintt
\catcode``=7 \`` `(``)```!
\endtt
\prob 16. Explain the error message you get if you say
\begintt
\endlinechar=`! \error
\endtt
in plain \TeX.
\prob 17. Fill in the missing macro definition so that the program
\begintt
\catcode`?=\active
\def\answer{...}
\answer
\endtt
will produce precisely the following error message when run with plain \TeX:
\begintt
! Undefined control sequence.
<recently read> How did this happen?
|noindent|null
l.3 \answer
|noindent|null
?
\endtt
(This problem is much harder than the others above, but there are
at least three ways to solve it!)
\prob 18. Consider what \TeX\ will do when it processes the following text:
\begintt
{\def\t{\gdef\a##}\catcode`d=12\t1d#2#3{#2}}
\hfuzz=100P\ifdim12pt=1P\expandafter\a
\expandafter\else\romannumeral888\relax\fi
\showthe\hfuzz \showlists
\endtt
(Assume that the category codes of plain \TeX\ are being used.)
Determine when the subroutines \\{scan\_keyword}, \\{scan\_int}, and
\\{scan\_dimen} are called as this text is being read, and explain
in general terms what results those subroutines produce.
\prob 19. What is the difference in interpretation, if any, between
the following two \TeX\ commands?
\begintt
\thickmuskip=-\thickmuskip
\thickmuskip=-\the\thickmuskip
\endtt
(Assume that plain \TeX\ is being used.) Explain why there is or isn't a
difference.
\prob 20. In what way would \TeX's behavior change if the
assignment at the end of \S508 were changed to `$b\gets(p=\\{null})$'\thinspace?
\prob 21. The initial implementation of \TeX82 had a much simpler
procedure in place of the one now in \S601:
$$\vbox{\halign{#\hfil\cr
{\bf procedure} \\{dvi\_pop};\cr
\quad{\bf begin if} $\\{dvi\_ptr}>0$ {\bf then}\cr
\qquad{\bf if} $\\{dvi\_buf}[\\{dvi\_ptr}-1]=\\{push}$ {\bf then}
\\{decr}(\\{dvi\_ptr})\cr
\qquad{\bf else} \\{dvi\_out}(\\{pop})\cr
\quad{\bf else} \\{dvi\_out}(\\{pop});\cr
\quad{\bf end};\cr}}$$
(No parameter $l$ was necessary.) Why did the author hang his head in shame
one day and change it to the form it now has?
\prob 22. Assign subscripts $d$, $y$, and $z$ to the sequence
of integers
$$2\,7\,1\,8\,2\,8\,1\,8\,2\,8\,4\,5\,9\,0\,4\,5$$
using the procedure sketched in \S604. (This is easy.)
\prob 23. Find a short \TeX\ program that will cause the
\\{print\_mode} subroutine to print `{\tt no mode}'.
(Do not assume that the category codes or macros of plain \TeX\ have
been preloaded.) Extra credit will be given to the person who has
the shortest program, i.e., the fewest tokens, among all correct
solutions submitted.
\prob 24. The textbook says in \S78 that \\{error} might be called within
\\{error} within a call of \\{error}, but the recursion cannot go any
deeper than this.
Construct a scenario in which \\{error} is entered three times
before it has been completed.
\prob 25. (The following question was the main problem on the midterm exam.)
Suppose {\tt WEB}'s conventions have been changed so that strings are
not identified by their number but rather by their starting position in
the \\{str\_pool} array. The \\{str\_start} array is therefore eliminated.
Strings of length~1 are still represented by their ASCII code values;
all other strings have values $\ge128$, and they appear in
the \\{pool\_file} just as before, in increasing order of starting position.
The special code `128' is assumed to terminate each string.
Thus, for example, if such a {\tt WEB} program uses just the three strings
{\tt"ab"}, {\tt""}, and {\tt"cd"} (in this order),
they will be represented in the corresponding Pascal code by the respective
integers 128, 131, and 132. The program in this case is expected to initialize
\\{str\_pool} locations 128--134 to the successive code values
97, 98, 128, 128, 99, 100, 128.
Such a modification requires lots of changes to \TeX.
Your job in this problem is to indicate what those changes should be.
However, you needn't specify a complete change file; just say how you would
modify \S39--\S48, \S59, \S259, \S407, \S464, and \S602 (if these sections
need to change at all). The other places where \\{str\_start} appears
can be changed in similar ways, and you needn't deal with those.
Some of the specified sections will require new code; you should supply
that code. Other sections may change only a little bit or not at all;
you should just give the grader sufficient explanation of what should
happen there.
\prob 26. Continuing problem 25,
discuss briefly whether or not it would be preferable
(a)~to store the length of
each string just before the first character, instead of using `128' just after
the last character; or (b)~to eliminate the extra `128' entirely and to save
space by adding 128 to the final character.
\prob 27. J. H. Quick (a student) thought he spotted a bug
in \S671 and he was all set to collect \$40.96 because of programs
like this:
\begintt
\vbox{\moveright 1pt\hbox to 2pt{}
\xleaders\lastbox\vskip 3pt}
\endtt
(He noticed that \TeX\ would give this vbox a width of 2\thinspace pt,
and he thought that the correct width was 3\thinspace pt.) However,
when he typed |\showlists| he saw that the leaders were simply
\begintt
\xleaders 3.0
.\hbox(0.0+0.0)x2.0
\endtt
and he noticed with regret the statement `$\\{shift\_amount}(\\{cur\_box})
\gets0$' in \S1081.
Explain how \S671 would have to be corrected, if
the \\{shift\_amount} of a leader box could be nonzero.
\eject % otherwise the page number is embarrassingly non-centered!
\begingroup
\hsize=3.5in
%\hbadness=-1
\prob 28. When your instructor made up this problem, he said
`|\hbadness=-1|' so that \TeX\ would print out the way each line of
this paragraph was broken. (He sometimes wants to check line breaks
without looking at actual output, when he's using a terminal that
has no display capabilities.) It turned out that \TeX\ typed this:
\endgroup\begingroup\vskip-6pt
\begintt
Loose \hbox (badness 0) in paragraph at lines 11--16
[]\tenrm When your instructor made up this problem, he said
|medskip
Tight \hbox (badness 3) in paragraph at lines 11--16
\tenrm `\tentt \hbadness=-1\tenrm ' so that T[]X would print out the way each
|medskip
Tight \hbox (badness 20) in paragraph at lines 11--16
\tenrm line of this paragraph was broken. (He sometimes wants to
|medskip
Loose \hbox (badness 1) in paragraph at lines 11--16
\tenrm check line breaks without looking at actual output, when
|medskip
Loose \hbox (badness 1) in paragraph at lines 11--16
\tenrm he's using a terminal that has no display capabilities.) It
\endtt
Why wasn't anything shown for the last line of the paragraph?
\endgroup
\prob 29. How would the output of \TeX\ look different if the \\{rebox}
procedure were changed by deleting the statement
`{\bf if} $\\{type}(b)=\\{vlist\_node}$ {\bf then} $b\gets\\{hpack}(b,
\\{natural})$'?
How would the output look different if the next conditional statement,
`{\bf if} $(\\{is\_char\_node}(p))$ \dots' were deleted?
(Note that box~$b$ might have been formed by \\{char\_box}.)
\prob 30. What spacing does \TeX\ insert between the characters
when it typesets the formulas |$x==1$|, |$x++1$|, and |$x,,1$|?
Find the places in the program where these spacing decisions are made.
\begingroup
\hsize=3.18in
\tracingparagraphs=1 \pretolerance=-1
\prob 31. When your instructor made up this problem, he said
`|\tracingparagraphs=1|' so that his transcript file would explain why
\TeX\ has broken the paragraph into lines in a particular way. He also said
`|\pretolerance=-1|' so that hyphenation would be tried immediately.
The output is shown on the next page; use it to determine
what line breaks would have been found by a simpler algorithm
that breaks one line at a time. (The simpler algorithm finds the
breakpoint that yields fewest demerits on the first line,
then chooses it and starts over again.)
\endgroup
\pageinsert
\font\ninett=cmtt9
\vbox to\vsize{\parindent=0pt\obeylines\let\tt=\ninett
\baselineskip=10pt plus 1pt
|% This is the paragraph-trace output referred to in Problem 30:|
|[]\tenrm When your in-struc-tor made up this prob-lem, he |
|@ via @@0 b=0 p=0 d=100|
|@@1: line 1.2 t=100 -> @@0|
|said `\tentt \tracingparagraphs=1\tenrm ' so that his tran-script |
|@ via @@1 b=4 p=0 d=196|
|@@2: line 2.2 t=296 -> @@1|
|file would ex-plain why T[]X has bro-ken the para-|
|@\discretionary via @@2 b=175 p=50 d=46725|
|@@3: line 3.0- t=47021 -> @@2|
|graph |
|@ via @@2 b=25 p=0 d=1225|
|@@4: line 3.3 t=1521 -> @@2|
|into lines in a par-tic-u-lar way. He also said |
|@ via @@3 b=69 p=0 d=6241|
|@@5: line 4.1 t=53262 -> @@3|
|`\tentt \pretolerance=-1\tenrm ' so that hy-phen-ation would be |
|@ via @@5 b=43 p=0 d=2809|
|@@6: line 5.1 t=56071 -> @@5|
|tried im-me-di-ately. The out-put is shown on the next |
|@ via @@6 b=0 p=0 d=100|
|@@7: line 6.2 t=56171 -> @@6|
|page; use it to de-ter-mine what line breaks would |
|@ via @@7 b=153 p=0 d=36569|
|@@8: line 7.0 t=92740 -> @@7|
|have |
|@ via @@7 b=34 p=0 d=1936|
|@@9: line 7.3 t=58107 -> @@7|
|been found by a sim-pler al-go-rithm that breaks |
|@ via @@8 b=1 p=0 d=10121|
|@@10: line 8.2 t=102861 -> @@8|
|one |
|@ via @@9 b=15 p=0 d=10625|
|@@11: line 8.1 t=68732 -> @@9|
|line at a time. (The sim-pler al-go-rithm finds |
|@ via @@10 b=164 p=0 d=40276|
|@@12: line 9.0 t=143137 -> @@10|
|the |
|@ via @@10 b=0 p=0 d=100|
|@ via @@11 b=192 p=0 d=40804|
|@@13: line 9.0 t=109536 -> @@11|
|@@14: line 9.2 t=102961 -> @@10|
|break-point that yields fewest de-mer-its on the |
|@ via @@12 b=174 p=0 d=33856|
|@@15: line 10.0 t=176993 -> @@12|
|first |
|@ via @@12 b=41 p=0 d=12601|
|@ via @@13 b=75 p=0 d=7225|
|@ via @@14 b=75 p=0 d=7225|
|@@16: line 10.1 t=110186 -> @@14|
|line, then chooses it and starts over again.) |
|@\par via @@15 b=0 p=-10000 d=10100|
|@\par via @@16 b=0 p=-10000 d=100|
|@@17: line 11.2- t=110286 -> @@16|
}
\endinsert
\prob 32. Play through the algorithms in parts 42 and 43, to figure
out the contents of \\{trie\_op}, \\{trie\_char}, \\{trie\_link},
\\{hyf\_distance}, \\{hyf\_num}, and \\{hyf\_next} after the statement
\begintt
\patterns{a1bc 2bcd3 ab1cd}
\endtt
has been processed. Then execute the algorithm of \S923, to see
how \TeX\ uses this efficient trie structure
to set the values of \\{hyf} when the word |aabcd| is hyphenated.
[The value of \\{hn} will be~5, and the values of \\{hc}[$1\,.\,.\,5$]
will be $(96,96,97,98,99)$, respectively, when \S923 begins.]
\prob 33. The \\{save\_stack} is normally empty when
a \TeX\ program stops. But if,
say, the user's input has an extra `|{|' (or a missing `|}|'), \TeX\
will print the warning message
\begintt
(\end occurred inside a group at level 1)
\endtt
(see \S1335).
\goodbreak
Explain in detail how to change \TeX\ so that such warning messages will be
more explicit. For example, if the source program has an unmatched `|{|'
on line~6 and an unmatched `|\begingroup|' on line~25, your modified \TeX\
should give two warnings:
\begintt
(\end occurred when \begingroup on line 25 was incomplete)
(\end occurred when { on line 6 was incomplete)
\endtt
You may assume that \\{simple\_group} and \\{semi\_simple\_group} are the
only group codes present on \\{save\_stack} when \S1335 is encountered;
if other group codes are present, your program should call \\{confusion}.
\prob 34. (The following question is the most difficult yet most important
of the entire collection. It was the main problem on the take-home final exam.)
\def\TeXX{\TeX\kern-.3emX}
The purpose of this problem is to extend \TeX\ so that it will sell better
in China and Japan. The extended program, called \TeXX, allows each font
to contain up to 65536 characters. Each extended character is represented
by two values, its `extension'~$x$ and its `code'~$c$, where both $x$ and~$c$
lie between 0 and~255 inclusive. Characters with the same `$c$' but different
`$x$' correspond to different graphics; but they have the same width, height,
depth, and italic correction.
\TeXX\ is identical to \TeX\ except that it has one new primitive command:
|\xchar|. If\/ |\xchar| occurs in vertical mode, it begins a new paragraph;
i.e., it's a $\langle$horizontal command$\rangle$ as on p.~283 of {\sl The
\TeX book}. If\/ |\xchar| occurs in horizontal mode it should be followed
by a $\langle$number$\rangle$ between 0 and 65535; this number can be converted
to the form $256x+c$, where $0\le x,c<256$. The corresponding extended character
from the current font will be appended to the current horizontal list, and the
space factor will be set to 1000.
(If $x=0$, the effect of\/ |\xchar| is something like the effect of\/ |\char|,
except that |\xchar| disables ligatures and kerns and it doesn't do anything
special to the space factor. Moreover, no penalty is inserted
after an |\xchar| that happens to be the |\hyphenchar| of the current font.)
A word containing an extended character will not be hyphenated.
The |\xchar| command should not occur in math mode.
Inside \TeXX, an extended character $(x,c)$ in font~$f$ is represented by
two consecutive \\{char\_node} items $p$ and~$q$, where we have
$\\{font}(p)=\\{null\_font}$, $\\{character}(p)=\\{qi}(x)$, $\\{link}(p)=q$,
$\\{font}(q)=f$, and $\\{character}(q)=\\{qi}(c)$. This two-word representation
is used even when $x=0$.
\TeXX\ typesets an extended character by specifying character number $256x+c$
in the |DVI| file. (See the \\{set2} command in \S585.)
If \TeXX\ is run with the macros of plain \TeX, and if the user types
`|\tracingall| |\xchar600| |\showlists|', the output of \TeXX\ will
include
\begintt
{\xchar}
{horizontal mode: \xchar}
{\showlists}
|noindent|null
### horizontal mode entered at line 0
\hbox(0.0+0.0)x20.0
\tenrm \xchar"258
spacefactor 1000
\endtt
(since 600 is |"258| in hexadecimal notation).
Your job is to explain in detail {\it all\/} changes to \TeX\ that are
necessary to convert it to \TeXX.
[Note: A properly designed extension would also include the primitive operator
|\xchardef|, analogous to |\chardef| and |\mathdef|, because a language
should be `orthogonally complete'.
However, this additional extension has not been included as part
of problem~34, because it presents no special difficulties. Anybody who
can figure out how to implement |\xchar| can certainly also handle |\xchardef|.]
\prob 35. The first edition of {\sl \TeX: The Program\/} suggested that
extended characters could be represented with the following convention:
The first of two consecutive \\{char\_node} items was to contain the font
code and a character code from which the dimensions could be computed
as usual; the second \\{char\_node} was a \\{halfword} giving the actual
character number to be typeset. Fonts were divided into two types,
based on characteristics of their |TFM| headers; `oriental' fonts always
used this two-word representation, other fonts always used the one-word
representation.
Explain why the method suggested in problem 34 is better than this.
(There are at least two reasons.)
% macros copied from WEBMAC will be used for the rest of this paper!
\let\sec=\S
\begingroup
\catcode`\@=\other
\catcode`\"=\other
\parskip 0pt % no stretch between paragraphs
\def\\#1{\hbox{\it#1\/\kern.05em}} % italic type for identifiers
\def\|#1{\hbox{$#1$}} % one-letter identifiers look a bit better this way
\def\{\hbox{\bf#1\/}} % boldface type for reserved words
\def\.#1{\hbox{\tentex % typewriter type for strings
\let\\=\BS % backslash in a string
\let\'=\RQ % right quote in a string
\let\`=\LQ % left quote in a string
\let\{=\LB % left brace in a string
\let\}=\RB % right brace in a string
\let\~=\TL % tilde in a string
\let\ =\SP % space in a string
\let\_=\UL % underline in a string
\let\&=\AM % ampersand in a string
#1}}
\def\#{\hbox{\tt\char`\#}} % parameter sign
\def\${\hbox{\tt\char`\$}} % dollar sign
\def\%{\hbox{\tt\char`\%}} % percent sign
\def\↑{\ifmmode\mathchar"222 \else\char`↑ \fi} % pointer or hat
% circumflex accents can be obtained from \↑↑D instead of \↑
\chardef\AM=`\& % ampersand character in a string
\chardef\BS=`\\ % backslash in a string
\chardef\LB=`\{ % left brace in a string
\def\LQ{{\tt\char'22}} % left quote in a string
\chardef\RB=`\} % right brace in a string
\def\RQ{{\tt\char'23}} % right quote in a string
\def\SP{{\tt\char`\ }} % (visible) space in a string
\chardef\TL=`\~ % tilde in a string
\chardef\UL=`\_ % underline character in a string
\newbox\bak \setbox\bak=\hbox to -1em{} % backspace one em
\newbox\bakk\setbox\bakk=\hbox to -2em{} % backspace two ems
\newcount\ind % current indentation in ems
\def\1{\global\advance\ind by1\hangindent\ind em} % indent one more notch
\def\2{\global\advance\ind by-1} % indent one less notch
\def\3#1{\hfil\penalty#10\hfilneg} % optional break within a statement
\def\4{\copy\bak} % backspace one notch
\def\5{\hfil\penalty-1\hfilneg\kern2.5em\copy\bakk\ignorespaces}% optional break
\def\6{\ifmmode\else\par % forced break
\hangindent\ind em\noindent\kern\ind em\copy\bakk\ignorespaces\fi}
\def\7{\Y\6} % forced break and a little extra space
\let\yskip=\smallskip
\def\to{\mathrel{.\,.}} % double dot, used only in math mode
\def\note#1#2.{\Y\noindent{\hangindent2em\baselineskip10pt\eightrm#1 #2.\par}}
\def\startsection{\Q\noindent\strut{\bf\modno.\quad}}
\def\defin#1{\global\advance\ind by 2 \1\&{#1 }} % begin `define' or `format'
\def\A{\note{See also}} % cross-reference for multiply defined section names
\def\B{\mathopen{\.{@\{}}} % begin controlled comment
\def\C#1{\ifmmode\gdef\XX{\null$\null}\else\gdef\XX{}\fi % Pascal comments
\XX\hfil\penalty-1\hfilneg\quad$\{\,$#1$\,\}$\XX}
\def\D{\defin{define}} % macro definition
\def\E{\cdot10↑} % exponent in floating point constant
\def\F{\defin{format}} % format definition
\let\G=\ge % greater than or equal sign
\def\H#1{\hbox{\rm\char"7D\tt#1}} % hexadecimal constant
\let\I=\ne % unequal sign
\def\J{\.{@\&}} % TANGLE's join operation
\let\K=\gets % left arrow
\let\L=\le % less than or equal sign
\outer\def\M#1.{\bigbreak\def\modno{#1}\startsection\ignorespaces\iftrue}
\def\O#1{\hbox{\rm\char'23\kern-.2em\it#1\/\kern.05em}} % octal constant
\def\P{\rightskip=0pt plus 100pt minus 10pt % go into Pascal mode
\parindent 1em % for paragraphs and for the first line of Pascal text
\sfcode`;=3000
\pretolerance 10000
\hyphenpenalty 10000 \exhyphenpenalty 10000
\global\ind=2 \1\ \unskip}
\def\Q{\rightskip=0pt % get out of Pascal mode
\parindent20pt
\sfcode`;=1500 \pretolerance 200 \hyphenpenalty 50 \exhyphenpenalty 50 }
\let\R=\lnot % logical not
\let\S=\equiv % equivalence sign
\def\T{\mathclose{\.{@\}}}} % terminate controlled comment
\def\U{\note{This code is used in}} % cross-reference for uses of sections
\let\V=\lor % logical or
\let\W=\land % logical and
\def\X#1:#2\X{\ifmmode\gdef\XX{\null$\null}\else\gdef\XX{}\fi % section name
\XX$\langle\,$#2{\eightrm\kern.5em#1}$\,\rangle$\XX}
\def\Y{\par\yskip}
\let\Z=\let % now you can \send the control sequence \Z
\def\){\hbox{\.{@\$}}} % sign for string pool check sum
\def\]{\hbox{\.{@\\}}} % sign for forced line break
\def\=#1{\kern2pt\hbox{\vrule\vtop{\vbox{\hrule
\hbox{\strut\kern2pt\.{#1}\kern2pt}}
\hrule}\vrule}\kern2pt} % verbatim string
\let\~=\ignorespaces
\def\ellipsis{\kern5em\smash{\vdots}\qquad\vbox to12pt{}\par}
\def\hang{\hangindent 3em\noindent\ignorespaces}
\medbreak
\noindent{\bf The Answers}
\prob 1. According to the index, \\{initialize} is declared in \sec4.
It is preceded there by \X13:Global variables\X, and \sec13 tells us that
the final global variable appears in \sec1345. Turning to \sec1345, we
find `\\{write\_loc}: \\{pointer};' and a comment. The comment doesn't
get into the Pascal code. The mini-index at the bottom of page~535
tells us that `\\{pointer}' is a macro defined in \sec115. Our quest
is nearly over, since \sec115 says that \\{pointer} expands to \\{halfword},
which is part of the Pascal program. Page~ix tells us that lowercase letters
of a |WEB| program become uppercase in the corresponding Pascal code;
page~x tells us that the underline in `\\{write\_loc}' is discarded.
Therefore we conclude that `|PROCEDURE| |INITIALIZE|' is immediately
preceded in the Pascal program by `|WRITELOC:HALFWORD;|'.
But this isn't quite correct! The book doesn't tell the whole
story. If we actually run |TANGLE| on |TEX.WEB| (without a change file),
we find that `|PROCEDURE| |INITIALIZE|' is actually preceded by
\begintt
{1345:}WRITELOC:HALFWORD;{:1345}
\endtt
because |TANGLE| inserts comments to show the origin of each block of code.
\prob 2. The index tells us that \\{done5} and \\{done6} are never used.
(They are included only for people who have to make system-dependent
changes and/or extensions.)
\prob 3. Here we change the \\{input\_ln} procedure of \sec31. One
way is to replace the statements `$\\{buffer}[\\{last}]\K\\{xord}[f\↑]$;
$\\{get}(f)$' by the following:
\Y\P\1\1\qquad
\&{if} $\\{ord}(\|f\↑)=\O{33}$ \1\&{then}\6
\&{begin} \37$\\{get}(\|f)$;\6
\&{if} $(\\{ord}(\|f\↑)\G\.{"@"})\W(\\{ord}(\|f\↑)\L\.{"\_"})$ \1\&{then}\6
\&{begin} \37$\\{buffer}[\\{last}]\K\\{xord}[\\{chr}(\\{ord}(\|f\↑)-\O{100})]$;%
\5
$\\{get}(\|f)$;\6
\&{end}\6
\4\&{else} $\\{buffer}[\\{last}]\K\\{invalid\_code}$;\2\6
\&{end}\6
\4\&{else} \&{begin} \37$\\{buffer}[\\{last}]\K\\{xord}[\|f\↑]$;\5
$\\{get}(\|f)$;\6
\&{end};\par\Q
\prob 4. The new string essentially substitutes ``quarters''~|q| (of
value~25) for ``dimes''~|x| (of value~10). Playing through the code of\/ \sec69
tells us that 69 is now represented by |lvvviv| and 9999 is |mmmmmmmmmcmqcvqiv|.
(The first nine |m|'s make 9000; then |cm| makes 900; then |qc| makes~75;
then |vq| makes~20; and |iv| makes the remaining~4.)
\prob 5. Because it may be decreased by~1 in \sec1293 before being increased
by~1 in~\sec82.
(The code in \sec1293 decreases \\{error\_count} because ``showing'' uses the
\\{error} subroutine although it isn't really an error.)
\prob 6. The |q| becomes |Q| in \sec83. This causes \sec86 to print
`|OK, entering \batchmode|', after which \\{selector} is decreased so
that `|...|' and $\langle$return$\rangle$ are {\it not\/} printed
on the terminal! (They appear only in the log file, if it has been opened.)
This is \TeX's way of confirming that |\batchmode| has indeed been entered.
\prob 7. (a)~Arithmetic overflow might occur when computing $t\ast297$,
because $7230585\times297=2↑{31}+97$. (b)~Some sort of test is need to
avoid division by zero when $0<s<297$. If $s<1663497$ then $s$~{\bf div}~$297<
5601$, and $7230585/5600$ is a bit larger than 1291 so we will have
$r>1290$ in such a case. The threshold value has therefore been chosen
to save division whenever possible. (One student suggested that the
statement `$r\K t$' be replaced by `$r\K1291$'. That might or might not
be faster, depending on the computer and the Pascal compiler. In machine
language one would `{\bf goto}' the statement that sets $\\{badness}\K
\\{inf\_bad}$, but that~is~inadmissible Pascal.)
(c)~If we get to \sec128 with $r=p+1$, we will try to make a node of size~1,
but~then there's no room for the \\{node\_size} field. (d)~If we get to
\sec129 with only one node available, we'll lose everything and \\{rover}
will be invalid. (Older versions of \TeX\ have a more complicated test
in \sec127, which would suppress going to \sec129 if there were two nodes
available. That was unnecessarily cautious.) (e)~This is a subtle one.
The lower part of memory must not be allowed to grow so large that
a \\{node\_size} value could ever exceed \\{max\_halfword} when nodes
are being merged together in \sec127.
\prob 8. We assume that $\\{min\_quarterword}=\\{min\_halfword}=0$.
$$\def\|#1{\tt\hfil#1 } % copy inside a box
\let\,=\thinspace
\setbox\strutbox=\hbox{\vrule height9pt depth3pt width0pt}
\def\g{\hfil\ignorespaces} % for floating point
\def\qqh#1#2#3{\vtop{\hbox{%
\hbox to 30pt{\|{#1}\kern-.2pt\vrule\kern-.2pt\strut}%
\hbox to 30pt{\|{#2}\kern-.2pt\vrule\kern-.2pt\strut}%
\hbox to 60pt{\|{#3}}}\kern-.2pt\hrule\kern-.2pt}}
\def\hh#1#2{\vtop{\hbox{%
\hbox to 60pt{\|{#1}\kern-.2pt\vrule\kern-.2pt\strut}%
\hbox to 60pt{\|{#2}}}\kern-.2pt\hrule\kern-.2pt}}
\def\w#1{\vtop{\hbox{\strut
\hbox to 120pt{\|{#1}}}\kern-.2pt\hrule\kern-.2pt}}
\def\topline{\noalign{\vskip-\baselineskip}
&\omit&\w{}\cr}
\halign{&\strut\kern20pt\tt#\,\hfil&\vrule#\kern-.2pt&
#&\kern-.2pt\vrule#&\quad#\hfil\cr
\topline
100:&&\qqh000&&\\{type} (\\{hlist\_node}), , \\{link}\cr
101:&&\w{6553600}&&\\{width} (100\,pt)\cr
102:&&\w0&&\\{depth}\cr
103:&&\w{655360}&&\\{height} (10\,pt)\cr
104:&&\w0&&\\{shift\_amount}\cr
105:&&\qqh12{200}&&
\\{glue\_sign} (\\{stretching}), \\{glue\_order} (\\{fill}), \\{list\_ptr}\cr
106:&&\w{10.0\g}&&\\{glue\_set} (type \\{real})\cr
\noalign{\medskip}\topline
200:&&\qqh71{10003}&&\\{type} (\\{disc\_node}), \\{replace\_count}, \\{link}\cr
201:&&\hh{300}{10000}&&\\{pre\_break}, \\{post\_break}\cr
\noalign{\medskip}\topline
300:&&\qqh{11}10&&\\{type} (\\{kern\_node}), \\{subtype} (\\{explicit}), %
\\{link}\cr
301:&&\w{655360}&&\\{width} (10\,pt)\cr
\noalign{\medskip}\topline
400:&&\qqh600&&\\{type} (\\{ligature\_node}), , \\{link}\cr
401:&&\qqh1{11}{10001}&&\\{font}, \\{character}, \\{lig\_ptr}\cr
\noalign{\medskip}\topline
500:&&\qqh{12}0{600}&&\\{type} (\\{penalty\_node}), , \\{link}\cr
501:&&\w{5000}&&\\{penalty}\cr
\noalign{\medskip}\topline
600:&&\qqh{10}0{700}&&\\{type} (\\{glue\_node}), \\{subtype} (\\{normal}), \\{link}\cr
601:&&\hh80&&\\{glue\_ptr} (\\{fill\_glue}), \\{leader\_ptr}\cr
\noalign{\medskip}\topline
700:&&\qqh100&&\\{type} (\\{vlist\_node}), , \\{link}\cr
701:&&\w{655360}&&\\{width} (10\,pt)\cr
702:&&\w{32768}&&\\{depth} (0.5\,pt)\cr
703:&&\w{327680}&&\\{height} (5\,pt)\cr
704:&&\w{-327680}&&\\{shift\_amount} ($-5$\,pt)\cr
705:&&\qqh00{800}&&
\\{glue\_sign} (\\{normal}), \\{glue\_order} (\\{normal}), \\{list\_ptr}\cr
706:&&\w{0.0\g}&&\\{glue\_set} (type \\{real})\cr
\noalign{\medskip}\topline
800:&&\qqh00{900}&&\\{type} (\\{hlist\_node}), , \\{link}\cr
801:&&\w{655360}&&\\{width} (10\,pt)\cr
802:&&\w0&&\\{depth}\cr
803:&&\w{327680}&&\\{height} (5\,pt)\cr
804:&&\w0&&\\{shift\_amount}\cr
805:&&\qqh00{10004}&&
\\{glue\_sign} (\\{normal}), \\{glue\_order} (\\{normal}), \\{list\_ptr}\cr
806:&&\w{0.0\g}&&\\{glue\_set} (type \\{real})\cr
\noalign{\medskip}\topline
900:&&\qqh200&&\\{type} (\\{rule\_node}), , \\{link}\cr
901:&&\w{-1073741824}&&\\{width} (\\{null\_flag})\cr
902:&&\w0&&\\{depth}\cr
903:&&\w{32768}&&\\{height} (0.5\,pt)\cr
\noalign{\medskip}\topline
\llap{10}000:&&\qqh1{"U"}{400}&&\\{font}, \\{character}, \\{link}\cr
\llap{10}001:&&\qqh1{"f"}{10002}&&\\{font}, \\{character}, \\{link}\cr
\llap{10}002:&&\qqh1{"f"}0&&\\{font}, \\{character}, \\{link}\cr
\llap{10}003:&&\qqh1{"!"}{500}&&\\{font}, \\{character}, \\{link}\cr
\llap{10}004:&&\qqh2{"d"}{10005}&&\\{font}, \\{character}, \\{link}\cr
\llap{10}005:&&\qqh2{"a"}0&&\\{font}, \\{character}, \\{link}\cr
}$$
\prob 9. (Norwegian Americans will recognize this as an `Uff da' joke.) The
output of \\{short\_display} is
\begintt
\large Uff []
\endtt
since \\{short\_display} shows the pre-break and post-break parts of a
discretionary (but not the replacement text). However, if this box were
output by \\{hlist\_out}, the discretionary break would not be effective;
the result would be a box 100$\,$pt wide, beginning with a large `!' and
ending with a small `da', the latter being raised 5$\,$pt and underlined
with a 0.5$\,$pt-rule.
\prob 10. Since \\{prev\_depth} is initially \\{ignore\_depth}, we get
\begintt
### vertical mode entered at line 1 (\output routine)
prevdepth -999.99998, prevgraf 1 line
\endtt
\prob 11. According to \sec236, $\\{int\_base}+17$ is where \\{mag} is
stored. (One of the definitions suppressed by an ellipsis on page 101 is
\\{mag}; you can verify this by checking the index!) The initial value
of \\{mag} is set in \sec240. Hence \\{show\_eqtb} branches to \sec242 and
prints `|\mag=1000|'.
\prob 12.
\prob 30. The behavior of the simpler algorithm, which we may call Brand~X,
can be deduced from the demerits values (`|d=|') in the trace output.
There is only one reasonable choice, |@@1|, for the first line; and there's
only one, |@@2|, for the second. But for the third, a line from |@@2| to
|@@3| (the break after `|para-|') has 46725 demerits, which certainly looks
worse than the 1225 demerits from |@@2| to |@@4|. This, however, leads Brand~X
into a trap, since there's no good way to continue from |@@4|. Similarly,
Brand~X will choose to go from |@@7| to |@@9|, and this forces it to
|@@11| and then infelicitously to |@@13| (because the syllable `|break-|'
is too long to be squeezed in). The resulting paragraph, as typeset by
Brand~X, looks like this:
$$\vbox{\hbadness=2000
\hsize=3.18in
\pretolerance=-1
\prob 30. When your instructor made up this problem, he said
`|\tracingparagraphs=1|' so that his transcript file would explain why
\TeX\ has broken the paragraph\break into lines in a particular way. He also said\break
`|\pretolerance=-1|' so that hyphenation would be tried immediately.
The output is shown on the next page; use it to determine
what line breaks would have\break been found by a simpler algorithm
that breaks one line at a time. (The simpler algorithm finds the
breakpoint that yields fewest demerits on the first line,
then chooses it and starts over again.)}$$
\endgroup\bye
\bye